A suffix tree approach to anti-spam email filtering
Similar resources
A Suffix Tree Approach to Email Filtering
We present an approach to email filtering based on the suffix tree data structure. A method for the scoring of emails using the suffix tree is developed and a number of scoring and score normalisation functions are tested. Our results show that the character level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared ...
Boosting Trees for Anti-Spam Email Filtering
This paper describes a set of comparative experiments for the problem of automatically filtering unwanted electronic mail messages. Several variants of the AdaBoost algorithm with confidence-rated predictions (Schapire & Singer 99) have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our experiments: a) The boosting-based met...
A Suffix Tree Approach to Text Categorisation Applied to Spam Filtering
We present an approach to textual classification based on the suffix tree data structure and apply it to spam filtering. A method for scoring of documents using the suffix tree is developed and a number of scoring and score normalisation functions are tested. Our results show that the character level representation of documents and classes facilitated by the suffix tree significantly improves c...
A Three-Way Decision Approach to Email Spam Filtering
Many classification techniques used for identifying spam emails treat spam filtering as a binary classification problem. That is, the incoming email is either spam or non-spam. This treatment is more for mathematical simplicity than for reflecting the true state of nature. In this paper, we introduce a three-way decision approach to spam filtering based on Bayesian decision theory, which pro...
Combining SVM Classifiers for Email Anti-spam Filtering
Spam, also known as Unsolicited Commercial Email (UCE), is becoming a nightmare for Internet users and providers. Machine learning techniques such as Support Vector Machines (SVM) have achieved high accuracy in filtering spam messages. However, a certain number of legitimate emails are often classified as spam (false positive errors), although this kind of error is prohibitively expensiv...
Journal
Journal title: Machine Learning
Year: 2006
ISSN: 0885-6125,1573-0565
DOI: 10.1007/s10994-006-9505-y